RESEARCH ARTICLE 



Conditionally Rare Taxa Disproportionately Contribute to Temporal 
Changes in Microbial Diversity 

Ashley Shade,a Stuart E. Jones,b J. Gregory Caporaso,c,d Jo Handelsman,e Rob Knight,f,g Noah Fierer,h,i Jack A. Gilbertc,j

Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, Michigan, USAa; Department of Biological Sciences, University of Notre
Dame, Notre Dame, Indiana, USAb; Institute for Genomic and Systems Biology, Argonne National Laboratory, Argonne, Illinois, USAc; Department of Biological Sciences,
Northern Arizona University, Flagstaff, Arizona, USAd; Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, Connecticut, USAe;
Howard Hughes Medical Institute, Boulder, Colorado, USAf; Department of Chemistry and Biochemistry and BioFrontiers Institute, University of Colorado, Boulder,
Colorado, USAg; Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, Colorado, USAh; Cooperative Institute for Research in Environmental
Sciences, University of Colorado, Boulder, Colorado, USAi; Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, USAj

ABSTRACT Microbial communities typically contain many rare taxa that make up the majority of the observed membership, yet 
the contribution of this microbial "rare biosphere" to community dynamics is unclear. Using 16S rRNA amplicon sequencing of 
3,237 samples from 42 time series of microbial communities from nine different ecosystems (air; marine; lake; stream; adult hu- 
man skin, tongue, and gut; infant gut; and brewery wastewater treatment), we introduce a new method to detect typically rare 
microbial taxa that occasionally become very abundant (conditionally rare taxa [CRT]) and then quantify their contributions to
temporal shifts in community structure. We discovered that CRT made up 1.5 to 28% of the community membership, repre- 
sented a broad diversity of bacterial and archaeal lineages, and explained large amounts of temporal community dissimilarity 
(i.e., up to 97% of Bray-Curtis dissimilarity). Most of the CRT were detected at multiple time points, though we also identified 
"one-hit wonder" CRT that were observed at only one time point. Using a case study from a temperate lake, we gained additional 
insights into the ecology of CRT by comparing routine community time series to large disturbance events. Our results reveal that 
many rare taxa contribute a greater amount to microbial community dynamics than is apparent from their low proportional 
abundances. This observation was true across a wide range of ecosystems, indicating that these rare taxa are essential for under- 
standing community changes over time. 

IMPORTANCE Microbial communities and their processes are the foundations of ecosystems. The ecological roles of rare microor- 
ganisms are largely unknown, but it is thought that they contribute to community stability by acting as a reservoir that can rap- 
idly respond to environmental changes. We investigated the occurrence of typically rare taxa that very occasionally become more 
prominent in their communities ("conditionally rare"). We quantified conditionally rare taxa in time series from a wide variety 
of ecosystems and discovered that not only were conditionally rare taxa present in all of the examples, but they also contributed 
disproportionately to temporal changes in diversity when they were most abundant. This result indicates an important and gen- 
eral role for rare microbial taxa within their communities. 

Microbial communities predominate Earth's diverse ecosystems,
contributing immense biomass and underpinning integral
biogeochemical processes. They sustain the bases of food
webs, provide key natural products that support human health 
and energy needs, and recycle carbon and nutrients that would 
otherwise stagnate. Despite the central role of microbial commu- 
nities in biological systems, we are just beginning to understand 
the intricate interactions between their members and how these 
interactions contribute to ecosystem functions. Of particular in- 
terest is the role of rare microorganisms within a community, 
which make up the majority of the observed membership at any 
given time (1-5) (see Fig. S1 in the supplemental material). Determining
whether these taxa remain rare or periodically bloom to
abundance will change our understanding of each organism's role
in microbially mediated ecosystem functions and, importantly, in
the stability of ecosystems in general. 

Rare microbial community members encompass an immense 
diversity (the "rare biosphere") (6-9). Still, the ecological roles of 
the vast majority of rare microorganisms remain unclear. Some 
rare microorganisms are likely on their way to local extinction (8) 
or are transient taxa that are "passing through" an environment 
(10-13). Some rare taxa may even be active, providing important 
functions that are disproportionate to their abundance or growth 
rate (14-16), and others may be dormant or inactive, awaiting 
favorable environmental conditions to grow (17, 18). An increase 
in the abundance of rare microorganisms that "wait" for favorable 
environmental conditions could be attributed to growth from low 
abundance, to awakening from dormancy, or to differential survival
(i.e., escape from predation). Though there are a variety of
ecological explanations for rare-to-prevalent dynamics, we still
lack general empirical documentation of the phenomenon among
microbial communities, and so its general incidence remains
uncertain.

Because rare microbial taxa are difficult to observe, even less is 
known about their dynamics than is known about their ecological 
roles. A key unknown is how often rare taxa become abundant and 
hence play a potentially greater role in the ecology of a given sys- 
tem. However, there are a small but growing number of studies 
that have documented the dynamics of rare microbial taxa and 
provide some insights. For example, in the Arctic Ocean, rare 
microorganisms exhibited biogeography, indicating that some 
rare taxa, like more abundant taxa, have distributions based on 
their ecological requirements (19). In a sulfide-rich artesian 
spring, rare taxa exhibited patchiness over 1 mm (20), which also 
suggests that rare taxa can have clear distributions at fine spatial 
scales. Additionally, some coastal sand communities have rare 
members that do not often become abundant, suggesting that 
these members have a minimal influence on biogeochemical pro- 
cesses (21). Conversely, in other coastal sand communities, rare 
microbial taxa were shown to be as sensitive as prevalent taxa to 
environmental changes caused by an off-shore oil spill (22). The 
discrepancy between the latter two studies highlights our modest 
knowledge of the potential contributions of rare taxa and espe- 
cially calls into question whether such conclusions are transferable 
to other ecosystems. Therefore, to understand the general impor- 
tance of rare microbial taxa, their contributions to the larger com- 
munity and their dynamics, we must systematically interrogate 
microbial communities from a variety of ecosystems by using con- 
sistent methods. 

The availability of inexpensive, high-throughput sequencing 
technologies has led to an increased number of temporal studies of 
microbial communities (23). One of these studies identified a mi- 
crobial taxon that bloomed to abundance from an apparently per- 
sistently rare state (24, 25). The dynamic of rarity to prevalence 
has also been observed in two other studies of marine bacterio- 
plankton (14, 26). Here, we asked how the pattern of microbial 
rarity to prevalence is manifested in communities inhabiting very 
different ecosystems. We refer to microbial taxa that are typically 
in low abundance in one locality but occasionally become preva- 
lent over time as "conditionally rare." 

Our objective was to understand the incidence of conditionally 
rare taxa (CRT) and their contribution to changes in microbial 
communities through time. We introduce a simple method for 
identifying CRT from temporal studies of diverse microbial com- 
munities and apply this method to a suite of time series data sets 
generated by using 454 pyrosequencing or Illumina sequencing of 
16S rRNA gene fragments. Each data set contained a large percentage
of very rare taxa, as is typical for microbial communities (see
Fig. S1 in the supplemental material). These data sets were previously
analyzed by using a closed-reference operational taxonomic
unit (OTU)-picking protocol (27) for direct comparison of their
temporal patterns (see Table S1 in the supplemental material)
(28). Because this OTU-picking protocol discards reads that do 
not match reference sequences at a minimum of 97% identity, it 
minimizes the rare OTUs arising from sequencing or PCR errors. 
The closed-reference protocol also avoids the "OTU splitting" 
that may occur when OTUs are defined by using a de novo protocol.
We show that within many ecosystems, CRT contributed to
temporal patterns of microbial diversity disproportionately to
their relative abundances, suggesting an important role for CRT in 
structuring microbial communities over time. We also explicitly 
examine the influences of sampling frequency, study duration, 
and sequencing depth on the detection of CRT. 

RESULTS 

A simple method for detecting CRT. Conditionally rare dynam- 
ics are exhibited when a taxon that is usually in low abundance or 
below the limit of detection occasionally blooms to an abundance 
appreciable at the community level. Thus, the frequency of a con- 
ditionally rare taxon's abundance over time exhibits a bimodal 
distribution. The lower mode of the distribution is near zero at the 
time points when the taxon was rare or undetected, and the upper 
mode is centered at the taxon's average abundance during a 
"bloom." A statistical method for detecting a bimodal distribution 
is to compute the coefficient of bimodality, b (29). We used this 
coefficient to detect CRT. From the distribution of a taxon's levels 
of abundance through time, the coefficient of bimodality, b, is 
calculated as follows: 

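$$b = \frac{1 + \mathrm{skewness}^{2}}{\mathrm{kurtosis} + 3}$$

where skewness is defined as follows:

$$\mathrm{skewness} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^{3} / n}{\left[ \sum_{i=1}^{n} (x_i - \bar{x})^{2} / n \right]^{3/2}}$$

and kurtosis is defined as follows:

$$\mathrm{kurtosis} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^{4} / n}{\left[ \sum_{i=1}^{n} (x_i - \bar{x})^{2} / n \right]^{2}} - 3$$

Here, x_i is the taxon's abundance at time point i, \bar{x} is its mean
abundance over the series, and n is the number of time points. Kurtosis is
the excess kurtosis, which is 0 for a normal distribution.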

The coefficient, b, ranges from 0 to 1, where 1 indicates the 
extreme case of the Bernoulli distribution (as in a binary data set; 
see Fig. S2A in the supplemental material). Thus, we identified 
bimodal taxon abundance distributions and then set a minimum 
relative abundance threshold of >0.01 and confirmed that we 
were able to identify a previously described conditionally rare 
Vibrio taxon in the western English Channel time series (24, 25) 
(see Fig. S2B). We also discovered two additional Vibrio taxa that 
exhibited similar but distinguishable dynamics in the western 
English Channel (see Fig. S2B) and confirmed that taxa with seasonal
or irregular dynamics did not have a b value of >0.90 (see
Fig. S2C). Thus, this method identified known and unknown CRT 
but excluded taxa that did not have rare-to-prevalent dynamics. 
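
As an illustration, the detection rule described above can be sketched in a few lines of Python. This is a minimal sketch and not the authors' R scripts: the function names and the default thresholds (b > 0.90 and a maximum relative abundance of > 0.005) are assumptions chosen for illustration, and kurtosis is assumed to be the excess kurtosis (fourth standardized moment minus 3), under which a two-valued series gives b = 1.

```python
def bimodality_coefficient(abundances):
    """Coefficient of bimodality, b = (1 + skewness^2) / (kurtosis + 3).

    `abundances` is one taxon's relative abundance at each time point.
    Kurtosis is taken as excess kurtosis (an assumption of this sketch).
    """
    n = len(abundances)
    mean = sum(abundances) / n
    m2 = sum((x - mean) ** 2 for x in abundances) / n
    if m2 == 0:
        # Constant series: no spread, so treat it as not bimodal (assumption).
        return 0.0
    m3 = sum((x - mean) ** 3 for x in abundances) / n
    m4 = sum((x - mean) ** 4 for x in abundances) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2 - 3.0
    return (1.0 + skewness ** 2) / (kurtosis + 3.0)


def is_conditionally_rare(abundances, b_min=0.90, abund_min=0.005):
    """Flag a taxon whose series is bimodal and whose peak exceeds abund_min.

    Both thresholds are illustrative defaults, not values fixed by the method.
    """
    peak = max(abundances)
    return peak > abund_min and bimodality_coefficient(abundances) > b_min
```

Under these assumptions, a taxon that is undetected at 19 of 20 time points and reaches 2% once is flagged, whereas a taxon whose abundances are spread evenly between 1% and 10% is not, because its b value falls well below the threshold.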

As each data set in this analysis had different sequencing ef- 
forts, sampling durations (numbers of days), and intensities 
(numbers of sampling events per unit of time), it was important to
determine how these affected the recoverable enumeration of 
CRT. To address this, we used three of the most comprehensive 
data sets available in terms of sequencing effort, study duration, 
and sampling intensity. The first data set was a human-skin- 
associated community (male M3, right palm, 8,230 taxa) sampled 
approximately daily for 1 year and sequenced with Illumina tech- 
nology (rarefied to 5,031 reads per sample). The second data set 
was a less rich temperate lake community (Trout Bog epilimnion, 
Wisconsin, 1,816 taxa) sampled periodically over 4 years with 
more intensive sampling during the ice-free season and sequenced 
with Illumina (rarefied to 5,134 reads per sample). The third data 
set comprised a marine surface water site in the western English 
Channel L4 (2,017 taxa) that was sampled approximately monthly 
for 6 years and sequenced by 454 pyrosequencing (rarefied to
3,526 reads per sample). We subsampled these time series along a 
range of sampling intensities and study durations and then calcu- 
lated the percentages of CRT (see Fig. S3 in the supplemental 
material). From these analyses, it is clear that sampling intensity 
has a greater influence on the detection of CRT than the study 
duration does. Analysis of the impact of the number of samples 
included in a study revealed the same pattern across ecosystems 
(see Fig. S3), suggesting that sampling intensity is the most critical 
factor and should be taken into consideration when designing 
studies to explore CRT dynamics. Temporal sampling intensity 
will be an ecosystem-specific parameter that depends on the an- 
ticipated rate of community turnover or average life span of mi- 
croorganisms in the system (30). See the supplemental material 
for additional considerations and recommendations for detecting 
CRT. 

As expected, the time series with fewer sequence reads per sam- 
ple had a higher percentage of CRT at a given sequencing depth 
(see Fig. S3E). This is because CRT made up a larger percentage of 
an inadequately sequenced community, which is an artifact of 
undersampling. The more undersampled the community, the 
larger the contribution of any taxon, including a conditionally 
rare taxon, will appear. Thus, unless a community is sequenced 
exhaustively and sampled at an intensity and duration appropriate 
for the community and range of environmental conditions in an 
ecosystem, the number of CRT detected will remain a conserva- 
tive estimate. 
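
The data sets above were rarefied to a common number of reads per sample before CRT were counted. The following is a minimal sketch of that kind of subsampling, drawing reads without replacement from a vector of per-taxon read counts. It is not the QIIME implementation; the function name, the `seed` parameter, and the example counts are hypothetical.

```python
import random


def rarefy(counts, depth, seed=None):
    """Subsample `depth` reads without replacement from per-taxon read counts.

    `counts` is a list of integer read counts, one entry per taxon.
    Returns a list of the same length whose entries sum to `depth`.
    """
    total = sum(counts)
    if depth > total:
        raise ValueError("sample has fewer reads than the requested depth")
    rng = random.Random(seed)
    # Expand the counts into one taxon index per read, then draw reads.
    reads = [taxon for taxon, c in enumerate(counts) for _ in range(c)]
    drawn = rng.sample(reads, depth)
    rarefied = [0] * len(counts)
    for taxon in drawn:
        rarefied[taxon] += 1
    return rarefied


# Hypothetical sample with 1,000 reads spread over four taxa.
example = rarefy([900, 90, 9, 1], depth=100, seed=1)
```

Taxa represented by only a few reads are often lost at shallow depths, which is one reason why the apparent share of any remaining taxon grows as sequencing depth shrinks.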

CRT are ubiquitous and contribute disproportionately to 
community changes. Acknowledging that detection of CRT will 
be a conservative estimate and will improve with increasing sam- 
pling intensity and duration appropriate to the expected commu- 
nity turnover in an ecosystem, we applied our method to the time 
series spanning nine distinct ecosystems, 42 microbial communi- 
ties (consortia sampled at a given locality), and 3,237 individual 
observations. We found that each community included taxa that 
exhibited rare-to-prevalent dynamics. The incidence of CRT 
ranged from 1.5 to 28% of the total community membership 
(Fig. 1A) (b value, >0.90; relative abundance, >0.5%); however, it
is important to note that when comparing CRT contributions to 
different ecosystems, these values should not be interpreted as 
absolute. To determine the contribution of CRT to the temporal 
dynamics of the community (temporal community dissimilarity), 
we calculated the fraction of Bray-Curtis dissimilarity attributable to
CRT (Fig. 1B and 2; see Materials and Methods), which ranged
from 0 to 97% of the total community dissimilarity between time
points (Fig. 2). The high values arose because, when CRT were abundant, their
dynamics often explained a large fraction of the community dissimilarity.
Interestingly, some ecosystems, such as the human gut
(Fig. 2C), exhibited relatively more punctuated contributions of 
CRT to community dissimilarity over time, while other ecosys- 
tems, such as air (Fig. 2A), exhibited a more consistent contribu- 
tion of CRT. 

CRT represented a broad range of phylogenetic diversity (see 
Fig. S4 in the supplemental material), with most environments 
being dominated by Proteobacteria, except for the infant and adult 
human guts, which were dominated by Firmicutes, and the human 
tongue, which had an equal contribution from Cyanobacteria 
(likely chloroplasts from food matter). There was no evidence that 
CRT consistently represented certain lineages when different eco- 
systems were compared (see Fig. S4 in the supplemental material). 

FIG 1 Incidences of CRT and their contributions to community dissimilarity.
(A) Incidences of CRT across different ecosystems. Error bars are standard
deviations of the means, but none are reported when n = 1 time series. (B)
Fraction of temporal community dissimilarity attributed to the dynamics of
CRT. Each open diamond is the mean of an ecosystem, whiskers are the lower
and upper quartiles, and closed circles show outliers. b value, >0.90; relative
abundance, >0.5%. (C) CRT observed only once in a time series, when blooming
(one-hit wonders). WWT, wastewater treatment.



Additionally, within a community, there were similar lineages 
represented among CRT and the whole community membership 
(see SOM in Fig. S5).

Again, because of the differences in sampling and sequencing 
strategies across data sets (28), we encourage readers to consider 
the general trends in CRT rather than absolute values. However, 
despite these nuances, these data show not only that CRT are 
widespread members of microbial communities but also that CRT 
contribute to community-level temporal changes disproportionately
to their relative abundances.

Synchrony among CRT transitions. To determine whether 
multiple CRT were synchronized in their transitions from rarity to 
prevalence, we performed hierarchical cluster analyses. Within 
each community, we found discrete clusters of CRT that shared 
the same occurrence patterns over time, as well as some CRT that 
had occurrence patterns that were independent and did not occur 
with other CRT (Fig. 3). These results suggest shared environmen- 
tal drivers or shared sources of dispersal for synchronous CRT (see 
SOM in Fig. S6 in the supplemental material). 

FIG 2 The fraction of consecutive Bray-Curtis dissimilarity attributed to the dynamics of CRT in representative communities. A, air time series, site Sp.,
670 days; B, brewery wastewater treatment (WWT), site U4, 305 days; C, human gut, site M3 (male), 442 days; D, human skin, site F4 right palm (female), 
185 days; E, infant gut, 834 days; F, freshwater bog lake, site TBE, 1,545 days; G, marine, western English Channel, site L4, 2,156 days; H, freshwater stream, site 
Orodell, 462 days. 



One-hit wonders: can we attribute CRT to large dispersal 
events? One mechanism of CRT dynamics could be the immigra- 
tion and temporary bloom of a foreign taxon. In our data sets, this 
would be indicated by a taxon that was below the limit of detec- 
tion, achieved abundance at one time point, and subsequently 
returned to undetectability; we refer to taxa exhibiting this dy- 
namic as "one-hit wonders." We wanted to understand how many 
CRT could be designated one-hit wonders, which would allow us 
to refine hypotheses about the potential for immigration events to 
affect community dynamics. We found that while the majority of 
CRT were detected at multiple time points, a subset of CRT were 
detected only when they bloomed, possibly because of immigra- 
tion followed by a bloom and a crash (one-hit wonders: median, 
9% of detected CRT; minimum, 0%; maximum, 53%, Fig. 1C). 
Generally, those communities that had relatively higher levels of 
temporal variability (i.e., air and stream communities) had more 
one-hit wonders than communities that were more stable (i.e., 
lake hypolimnia) (28). The exceptions were the brewery
wastewater treatment communities, which were relatively stable
but had high percentages of one-hit wonders; however, these time
series also had a low sampling intensity, which could contribute to
an increase in CRT as discussed above. Another scenario is that a
one-hit wonder was always present but below the limit of detec- 
tion. Because the percentage of one-hit wonders was moderately 
correlated to the sampling intensity (Pearson's correlation coefficient,
-0.51; P = 0.0005), longer or more intensely sampled time
series may reveal multiple occurrences of a conditionally rare 
taxon that was originally designated a one-hit wonder. 
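
The one-hit-wonder designation reduces to a simple count of detections. The sketch below assumes that each taxon passed in has already been classified as a CRT and that "detected" means a relative abundance above zero; the function names and the example series are hypothetical.

```python
def is_one_hit_wonder(abundances):
    """True if the taxon was detected (abundance > 0) at exactly one time point."""
    return sum(1 for x in abundances if x > 0) == 1


def percent_one_hit_wonders(crt_series):
    """Percentage of CRT that were observed only once.

    `crt_series` maps a taxon identifier to its abundance time series and is
    assumed to contain only taxa already classified as conditionally rare.
    """
    if not crt_series:
        return 0.0
    hits = sum(1 for series in crt_series.values() if is_one_hit_wonder(series))
    return 100.0 * hits / len(crt_series)


# Hypothetical CRT from one time series with four sampling points.
example = {
    "otu_a": [0.0, 0.0, 0.03, 0.0],      # seen once: one-hit wonder
    "otu_b": [0.001, 0.0, 0.05, 0.002],  # seen three times
}
```

Because a single undetected occurrence separates the two classes, this count is sensitive to the limit of detection, in line with the caveats raised above.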

Unraveling CRT ecology by comparing time series to distur- 
bance events. We propose two classifications of CRT: those that 
contribute to community dynamics given routine environmental 
changes (e.g., seasonal changes) (31) and those that contribute 
after a drastic disturbance. We distinguished these two classifica- 
tions of CRT in a temperate lake microbial community that was 
the object of a whole-ecosystem disturbance experiment (32). The 
community was observed over the ice-free seasons in 2007, 2008, 
and 2009, and the disturbance experiment was conducted in July 
2008. Using the temporal study as a baseline to understand routine
dynamics, we could determine CRT that were important for a
community response to the disturbance, helping to understand 
ecological drivers of CRT. 

In this study, the epilimnion and hypolimnion thermal layers 
of a small bog lake (North Sparkling Bog, Wisconsin) were forced 
to mix at peak summer stratification (July 2008) with two large 
membranes that oscillated in the water column over the deepest 
point of the lake for 8 days until thermal homogeneity was 
achieved. The epilimnion was warm and oxygenated and had high 
light penetration, while the hypolimnion was cold and anaerobic 
and had low light penetration. Usually, thermal stratification 
weakens every spring and autumn as cool air causes the epilim- 
nion temperature to decrease and meet the hypolimnion temper- 
ature, initiating seasonal mixing. Previous work showed that the 
microbial community structure and chemistry recovered to their 
predisturbance state within 20 days after the forced mixing in 
summer and that the hypolimnion community was more sensitive 
to mixing than the epilimnion (32). Therefore, we focused on the 
response of hypolimnion CRT to the forced mixing in summer. 

A total of 24 CRT (b value, >0.90; relative abundance, >0.005; 
see SOM in the supplemental material) were detected in the hy- 
polimnion of North Sparkling Bog between 2007 and 2009. 
Changes in the abundance of these 24 CRT could be described by 
four distinct patterns of increased relative abundance: (i) responding
to natural and forced mixing events (Fig. 4A), (ii) responding
only to the forced mixing event (Fig. 4B), (iii) responding
only to the natural mixing events (Fig. 4C), and (iv) increasing at times
that were not defined by any type of mixing (Fig. 4D). The first
group of CRT were probably driven by key environmental condi- 
tions associated with the phenomenon of mixing (i.e., oxygen- 
ation of the hypolimnion, redistribution of nutrients in the water 
column) and were unaffected by seasonal differences. This dy- 
namic suggests that these CRT were rare but always present. The 
second group of CRT likely gained a competitive advantage given 
the novel environmental conditions caused by the forced mixing 
in summer. For example, OTU 333636, a member of the deltaproteobacterial
family Haliangiaceae (Fig. 4B), bloomed after anaerobic
conditions had been reestablished in the hypolimnion immediately
following the forced mixing event but while the
temperature remained elevated above typical seasonal averages 
(31), suggesting that this CRT thrives in warm and anaerobic wa- 
ters, which would never have occurred during natural mixing 
events. The third group of CRT likely had seasonal constraints,
and the fourth group of CRT may use multiple strategies to adapt
to an increasing biomass, or similar but unmeasured environmen- 
tal niches were established periodically. Included in the fourth 
group was a one-hit wonder that did not increase during the 
forced mixing in summer. 

FIG 4 Representative dynamics of CRT from the North Sparkling Bog hypolimnion,
observed over three ice-free seasons (2007 to 2009) that included a
whole-ecosystem mixing experiment in July 2008 (shaded in gray, between
dashed lines). Bloom events are purple points. Samples collected during fall
mixing are underlined in red on the x axis, those collected during spring mixing
are underlined in green, and those collected under ice are underlined in
blue. A, Sphingobacteriales OTU 70346; B, Haliangiaceae OTU 333636; C,
Flavobacterium OTU 426108; D, Moraxellaceae OTU 584176. Note the differences
in y axis ranges.



DISCUSSION 

Our results show that CRT can influence changes in microbial 
community structure. CRT contributed from 0 to 97% of the vari- 
ability in the observed temporal community dissimilarity. 
Though it may seem obvious that CRT would contribute the most 
to temporal community dissimilarity during their transitions, it 
was unexpected that they would contribute so disproportionately 
(i.e., up to 97%) compared with their relative abundance during a 
"bloom" (mean relative abundance during a bloom, 2.7%; me- 
dian, 1%). Our previous analysis suggested that the longer a com- 
munity is observed, the more the perceived magnitude of the 
changes in community structure is reduced, suggesting very low 
rates of community change over long-term observations (28). To- 
gether, these results indicate that many baseline temporal changes 
in bacterial and archaeal diversity may be attributed to changes in 
the relative contributions of taxa that already exist within the 
community, including CRT transitions. 

We provide a simple tool for identifying CRT and suggest that, 
on the whole, CRT comprise taxa that are always present and that 
it is less common for these taxa to be introduced by a dispersal
event. However, while our strategy identifies taxa that can be targeted
for further analysis, it does not explicitly reveal the ecological
mechanisms of CRT within a community. These mechanisms
are diverse and numerous, and determining the ecological prop- 
erties of individual taxa is difficult and costly (33-35). However, 
we provide one example in which we capitalized on a temporal 
lake study to deduce CRT ecology by contrasting routine dynam- 
ics with a disturbance. In doing so, we were able to distinguish 
CRT that responded to both natural and forced mixing events 
from those that responded only to a forced event. These methods 
provide a springboard for hypothesis generation and are useful for 
understanding the contributions of CRT to different types of eco- 
logical dynamics. For example, in the context of human microbial 
consortia, similar analyses may be done in instances of pathogen 
invasion or pathobiont formation to understand when, how, and 
under what environmental conditions a typically rare or invasive 
member of the human microbiome is able to thrive following such 
a disturbance. 

Though we cannot prove that one-hit-wonder CRT are not 
artifacts due to PCR (36) or sampling anomalies (37), the fact that 
the majority of CRT were observed multiple times within a series 
suggests that this scenario is not common and indicates that CRT
would remain important contributors to community dynamics
despite occasional misidentification due to artifacts. In reality, 
one-hit-wonder CRT likely comprise a combination of newly dis- 
persed taxa that fail to thrive long term, rare but persistent taxa 
that fall below the level of detection when not blooming, CRT that 
were not observed long enough to detect subsequent blooms (in- 
sufficient time series), and artifacts. 

There have been two distinct approaches to considering the 
rare biosphere in microbial ecology: (i) deep sequencing to detect 
as many rare members as possible (6) and (ii) omission of the 
entire rare "tail" to clarify overarching community patterns, 
whether arbitrary (e.g., 50 or fewer sequences) or methodological 
(e.g., after determining the abundance cutoff at which rare taxa do 
not contribute substantially to community dissimilarity) (38). Al- 
though the ecological roles of many rare taxa are unknown, it has 
been suggested that rare taxa are not necessarily important for the 
comparison and interpretation of microbial community patterns 
(10, 38). As more data from temporal studies of microbial com- 
munities are collected, it is likely that the dynamics of CRT will 
play an increasingly important role in our understanding of both 
the subtle temporal variability (39) and the disturbance responses 
of microbial communities. Furthermore, we know that some rare 
taxa play critical ecological roles in ecosystems, for example, diazotrophs
in seawater (40), bacterial and archaeal ammonia oxidizers
in soils (41, 42), and methanogens in guts (43). Thus, detection
of CRT will provide clues as to the identities of rare taxa
that play previously unknown but equally critical ecological roles.
Finally, studies that use unsaturated sequencing efforts to infer 
community assembly rules may attribute the appearance of new 
taxa to dispersal, when these taxa may instead already persist in the 
community in low abundance or in a dormant state (24). There- 
fore, close inspection of CRT dynamics in sufficiently sequenced 
communities will provide insights into the different roles of dis- 
persal and blooms in community dynamics. 

Given the ubiquity of CRT detected across an array of ecosys- 
tems and the large contribution of CRT to community dissimilar- 
ity, our results show that rare-to-prevalent dynamics are generally 
important and that these dynamics are especially critical for the 
community at the time points of CRT transitions. These data provide
evidence that not all of the members of the microbial rare
biosphere are always rare but that many contribute to the larger 
community at key time points. Furthermore, our analysis revealed 
synchronous dynamics of many CRT within a community and 
suggests that some CRT may be indicators of environmental 
changes that are unmeasured, providing clues about the identities 
of more subtle physical, chemical, or biological drivers of micro- 
bial dynamics. Finally, as transient members of the rare biosphere, 
CRT likely contribute to the high alpha diversity observed in many 
microbial communities. 

MATERIALS AND METHODS 

The microbial time series used in this study were previously published as 
separate studies (12, 25, 44-47), except for the lake data set, which is
available from the Earth Microbiome Project
(http://www.earthmicrobiome.org) (48). The whole-lake manipulation, including
physical and chemical lake conditions, was described previously (32).
The descriptions, quality control, and normalization of these data sets also 
are detailed elsewhere (28). OTUs were defined at 97% sequence identity 
of the 16S rRNA gene. We chose to include these 42 time series because 
they had study durations of at least 60 days. Because microbial commu- 
nities have different degrees of richness, relative abundances were used 
when comparing community members. The overarching patterns of CRT 
were robust when different thresholds were used for the coefficient of 
bimodality and maximum relative abundance (see SOM in Fig. S7 in the 
supplemental material). 

The study duration was the total number of days spanning the time 
series collection. Sampling intensity was the average number of days be- 
tween observations. The influences of study duration and sampling inten- 
sity on the detection of CRT were investigated by subsampling the human 
male M3 gut, freshwater lake Trout Bog epilimnion, and marine L4 western
English Channel time series by a "moving-window" approach (49).
This approach involves the partitioning of a time series into as many 
window subsets as possible given the number of observations and calcu- 
lation of the number of CRT detected within each window. For example, 
a 250-time-point series would first be divided into one 250-point window, 
two 249-point windows, three 248-point windows, etc. Subsampling of a 
data set to fewer sequences per sample (rarefaction) was performed by 
using the multiple_rarefactions.py script in QIIME v. 1.6.0 (50). We also 
rarefied the observed taxa classified as CRT by generating replicated, sub- 
sampled data sets at systematically varied sampling effort (i.e., number of 
samples). The percentage of CRT was calculated for each subsampled data 
set as described above. To extrapolate rarefaction curves to a standard 
sample size, the three parameters of the function 



(b + c) 

were estimated by maximum likelihood using custom scripts in R. 
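
The moving-window partition described above can be enumerated directly. The sketch below is an illustration rather than the authors' R code: it assumes windows are contiguous runs of consecutive observations, and the function names and the `min_length` parameter are hypothetical.

```python
from collections import Counter


def moving_windows(n_points, min_length=2):
    """Yield (start, stop) index pairs, stop exclusive, for every contiguous
    window of a series with `n_points` observations, longest windows first."""
    for length in range(n_points, min_length - 1, -1):
        for start in range(n_points - length + 1):
            yield start, start + length


def windows_per_length(n_points, min_length=2):
    """Count how many windows exist for each window length."""
    return Counter(stop - start for start, stop in moving_windows(n_points, min_length))


# For a 250-point series this reproduces the pattern given in the text:
# one 250-point window, two 249-point windows, three 248-point windows.
tally = windows_per_length(250, min_length=248)
```

The percentage of CRT would then be recomputed on the slice `series[start:stop]` for each window, so that detection can be related to the number of observations and the time spanned by each window.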

The R environment for statistical computing v 2.15.2 was used for all 
other analyses (51). Hierarchical clustering of CRT (to determine synchronous
responses) was performed as described previously (10), by using
dynamics of CRT standardized for each time series and k-means clustering
of common occurrence patterns. To assess whether the subset of CRT
represented a composition or structure different from that of the whole 
community, we used Pearson's product-moment correlation. Some plots 
were made in R with the ggplot2 package (52). We calculated Bray-Curtis 
dissimilarity as a metric of community dissimilarity as follows: 



$$\mathrm{BC}_{jk} = \frac{\sum_{i} \left| X_{ij} - X_{ik} \right|}{\sum_{i} \left( X_{ij} + X_{ik} \right)}$$

where BC is the Bray-Curtis dissimilarity between communities j and k
and X is the relative abundance of taxon i. For each time series, we calculated
the Bray-Curtis dissimilarity between all of the samples and then calculated
the dissimilarity attributed to the taxa that were identified as conditionally
rare (b value, >0.90; relative abundance, ≥0.05%). Because the Bray-Curtis
dissimilarity is a scaled summation of abundance differences between
two communities, we can easily partition Bray-Curtis dissimilarity
between two samples attributable to a subset of the community. To do so, 
we use only CRT when calculating the summation in the numerator of the 
Bray-Curtis dissimilarity expression but use all of the taxa when calculat- 
ing the scaling summation in the denominator. In this way, the Bray- 
Curtis dissimilarity of CRT and non-CRT will sum to the Bray-Curtis 
dissimilarity of the whole sample. We then divided the Bray-Curtis dis- 
similarity of CRT by the total community Bray-Curtis dissimilarity to 
report the fraction of beta diversity attributed to CRT. R scripts for calcu- 
lation of CRT are freely available on GitHub (53). 
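
The partition described above can be sketched as follows. This is a minimal Python illustration and not the authors' R scripts: samples are represented as dictionaries mapping taxon identifiers to relative abundances, and the function names and example values are hypothetical.

```python
def bray_curtis(sample_j, sample_k, taxa=None):
    """Bray-Curtis dissimilarity between two samples.

    If `taxa` is given, only those taxa enter the numerator, while the
    denominator is always summed over all taxa, as described in the text.
    """
    all_taxa = set(sample_j) | set(sample_k)
    numer_taxa = all_taxa if taxa is None else all_taxa & set(taxa)
    numer = sum(abs(sample_j.get(t, 0.0) - sample_k.get(t, 0.0)) for t in numer_taxa)
    denom = sum(sample_j.get(t, 0.0) + sample_k.get(t, 0.0) for t in all_taxa)
    return numer / denom if denom else 0.0


def crt_fraction(sample_j, sample_k, crt_taxa):
    """Fraction of the total Bray-Curtis dissimilarity attributable to CRT."""
    total = bray_curtis(sample_j, sample_k)
    if total == 0:
        return 0.0
    return bray_curtis(sample_j, sample_k, crt_taxa) / total


# Hypothetical pair of consecutive samples; taxon "C" blooms from zero.
before = {"A": 0.5, "B": 0.5, "C": 0.0}
after = {"A": 0.5, "B": 0.3, "C": 0.2}
share = crt_fraction(before, after, crt_taxa={"C"})
```

In this example the blooming taxon makes up 20% of the second sample but accounts for half of the dissimilarity between the two samples, and the CRT and non-CRT parts add up to the dissimilarity of the whole community.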